There is tremendous potential in AI, but:
Read more: https://www.statnews.com/
From Recital 71 of the EU GDPR:
“(71) The data subject should have the right not to be subject to a decision, which may include a measure, evaluating personal aspects relating to him or her which is based solely on automated processing and which produces legal effects concerning him or her or similarly significantly affects him or her, such as automatic refusal of an online credit application or e-recruiting practices without any human intervention.
Such processing includes ‘profiling’ that consists of any form of automated processing of personal data evaluating the personal aspects relating to a natural person, in particular to analyse or predict aspects concerning the data subject’s performance at work, economic situation, health, personal preferences or interests, reliability or behaviour, location or movements, where it produces legal effects concerning him or her or similarly significantly affects him or her.
However, decision-making based on such processing, including profiling, should be allowed where expressly authorised by Union or Member State law to which the controller is subject, including for fraud and tax-evasion monitoring and prevention purposes conducted in accordance with the regulations, standards and recommendations of Union institutions or national oversight bodies and to ensure the security and reliability of a service provided by the controller, or necessary for the entering or performance of a contract between the data subject and a controller, or when the data subject has given his or her explicit consent.
In any case, such processing should be subject to suitable safeguards, which should include specific information to the data subject and the right to obtain human intervention, to express his or her point of view, to obtain an explanation of the decision reached after such assessment and to challenge the decision.”
When we think about the interpretability of models we usually distinguish three classes of methods
Students A, B and C carry out a project together. With this payoff table, determine what portion of the award each student should get.
\[ \phi_j = \frac{1}{|P|!} \sum_{\pi \in \Pi} (v(S_j^\pi \cup \{j\}) - v(S_j^\pi)) \]
where \(\Pi\) is the set of all permutations of the players \(P\), and \(S_j^\pi\) is the set of players that precede player \(j\) in permutation \(\pi\).
\[ \hat\phi_j = \frac{1}{|B|} \sum_{\pi \in B} (v(S_j^\pi \cup \{j\}) - v(S_j^\pi)) \]
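The two formulas above can be tried out on a small game. The sketch below is only an illustration: the payoff table is hypothetical (the original table is not reproduced here), and it assumes that \(B\) is a random sample of permutations drawn from \(\Pi\). The function names are made up for this example.

```python
import itertools
import random

# Hypothetical payoff table: v(S) for every coalition S of students A, B, C.
PAYOFF = {
    frozenset(): 0,
    frozenset("A"): 10,
    frozenset("B"): 20,
    frozenset("C"): 30,
    frozenset("AB"): 40,
    frozenset("AC"): 50,
    frozenset("BC"): 60,
    frozenset("ABC"): 100,
}
PLAYERS = ["A", "B", "C"]


def marginal_contributions(order, v):
    """For one permutation, return v(S + {j}) - v(S), where S = players before j."""
    contrib = {}
    before = frozenset()
    for j in order:
        contrib[j] = v[before | {j}] - v[before]
        before = before | {j}
    return contrib


def shapley_exact(players, v):
    """Average the marginal contributions over all |P|! permutations."""
    perms = list(itertools.permutations(players))
    phi = {j: 0.0 for j in players}
    for order in perms:
        for j, c in marginal_contributions(order, v).items():
            phi[j] += c
    return {j: total / len(perms) for j, total in phi.items()}


def shapley_sampled(players, v, n_samples, seed=0):
    """Estimate the values from n_samples random permutations (the set B)."""
    rng = random.Random(seed)
    phi = {j: 0.0 for j in players}
    for _ in range(n_samples):
        order = players[:]
        rng.shuffle(order)
        for j, c in marginal_contributions(order, v).items():
            phi[j] += c
    return {j: total / n_samples for j, total in phi.items()}


if __name__ == "__main__":
    print(shapley_exact(PLAYERS, PAYOFF))
    print(shapley_sampled(PLAYERS, PAYOFF, n_samples=200))
```

In both versions the shares add up to the payoff of the full coalition, because the marginal contributions within a single permutation always sum to \(v(P)\).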
Let’s start with local explanations, focused on a single point \(x\) and the model prediction \(f(x)\).
Now instead of players, you can think about variables. We will distribute a reward between variables to recognize their contribution to the model prediction \(f(x)\).
age, which means conditioning the data with the condition age=8. Next comes class, with the condition class=1st. In the next step, we add fare to the coalition, and so on.
Adding the class variable to a coalition with the age variable increases the reward by \(0.086\).
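Growing a coalition one variable at a time can be sketched as follows. Everything here is an assumption made for illustration: the data, the scoring model, and the choice to define the reward of a coalition as the average prediction after fixing the coalition's variables to the values of the explained observation (a replacement-based stand-in for conditioning the data). It is not the model behind the numbers quoted in the text.

```python
# Hypothetical toy data; not the data set used in the text.
DATA = [
    {"age": 8, "class": 1, "fare": 70.0},
    {"age": 30, "class": 3, "fare": 8.0},
    {"age": 45, "class": 2, "fare": 20.0},
    {"age": 60, "class": 1, "fare": 90.0},
]


def model(row):
    """Hypothetical survival score in [0, 1]."""
    score = 0.2
    score += 0.3 if row["age"] < 16 else 0.0
    score += 0.3 if row["class"] == 1 else 0.0
    score += 0.1 if row["fare"] > 50 else 0.0
    return score


def value(coalition, x, data, f):
    """Average prediction with the coalition's variables fixed to x's values."""
    fixed = {k: x[k] for k in coalition}
    preds = [f({**row, **fixed}) for row in data]
    return sum(preds) / len(preds)


def contributions_along(order, x, data, f):
    """Add variables to the coalition in the given order and record each gain."""
    contrib = {}
    coalition = []
    prev = value(coalition, x, data, f)
    for var in order:
        coalition.append(var)
        cur = value(coalition, x, data, f)
        contrib[var] = cur - prev
        prev = cur
    return contrib


if __name__ == "__main__":
    x = {"age": 8, "class": 1, "fare": 72.0}
    print("baseline:", value([], x, DATA, model))
    print(contributions_along(["age", "class", "fare"], x, DATA, model))
```

The baseline plus all contributions equals \(f(x)\) for any ordering; averaging the contributions over many orderings gives the Shapley-style attribution for each variable.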
Desired characteristics of explanations (from LIME paper)
The core ideas behind LIME are:
The explanation will be a model \(g\) that approximates the behavior of the complex model \(f\) and is as simple as possible
\[ \hat g = \arg \min_{g \in G} L\{f, g, \pi(x)\} + \Omega(g) \]
where
Explanations can be calculated with the following steps.
sample_around(x')
similarity(x', z'[i])
K-LASSO(y', x', w')
where
similarity – a distance function in the original data space
K-LASSO – a weighted LASSO linear-regression model that selects K variables
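A rough sketch of this procedure for tabular data is given below. Several simplifications are assumptions of the sketch, not part of the method as described: the interpretable space is taken to be the original space, sampling uses Gaussian noise, similarity is a Gaussian kernel on Euclidean distance, and K-LASSO is swapped for a plain weighted least-squares fit that keeps the K variables with the largest absolute coefficients. The black-box model and all parameter names are hypothetical.

```python
import numpy as np


def black_box(X):
    """Hypothetical complex model f: nonlinear in columns 0 and 1, ignores column 2."""
    return np.sin(X[:, 0]) + X[:, 1] ** 2


def sample_around(x, n, scale, rng):
    """Draw n perturbed copies of x using Gaussian noise (an assumed sampling scheme)."""
    return x + rng.normal(0.0, scale, size=(n, x.size))


def similarity(x, Z, width):
    """Gaussian kernel on Euclidean distance: samples closer to x get larger weights."""
    d2 = np.sum((Z - x) ** 2, axis=1)
    return np.exp(-d2 / width ** 2)


def weighted_linear_fit(Z, y, w):
    """Weighted least squares with an intercept; returns (intercept, slopes)."""
    A = np.column_stack([np.ones(len(Z)), Z])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]


def explain(x, f, n=2000, k=2, scale=0.3, width=1.0, seed=0):
    """Return {variable index: local slope} for the k most influential variables."""
    rng = np.random.default_rng(seed)
    Z = sample_around(x, n, scale, rng)   # z'[i]
    y = f(Z)                              # y'[i] = f(z'[i])
    w = similarity(x, Z, width)           # w'[i]
    # Stand-in for K-LASSO: fit on all variables, keep the k largest slopes, refit.
    _, slopes = weighted_linear_fit(Z, y, w)
    top = np.argsort(-np.abs(slopes))[:k]
    _, top_slopes = weighted_linear_fit(Z[:, top], y, w)
    return {int(j): float(b) for j, b in zip(top, top_slopes)}


if __name__ == "__main__":
    print(explain(np.array([0.0, 1.0, 5.0]), black_box))
```

Around the point (0, 1, 5) the returned slopes should be close to the local derivatives of the toy model for the first two variables, while the third variable, which the model ignores, is dropped from the explanation.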
Let’s see how LIME can be used to solve this problem.
Initial settings
Interpretable data space
Sampling around x
Fitting of an interpretable model
How to transform the input data into a binary vector of shorter length?